Overview

Dataset statistics

Number of variables: 18
Number of observations: 116147
Missing cells: 68136
Missing cells (%): 3.3%
Duplicate rows: 8369
Duplicate rows (%): 7.2%
Total size in memory: 61.6 MiB
Average record size in memory: 556.3 B
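
The two percentages above can be recomputed from the raw counts. The following is a minimal sketch, assuming missing cells are measured against rows × columns and duplicate rows against the number of observations; the function name is illustrative and not part of pandas-profiling.

```python
def overview_percentages(n_vars, n_obs, missing_cells, duplicate_rows):
    """Return (missing %, duplicate %) from the counts in the overview."""
    total_cells = n_vars * n_obs  # every variable has one cell per observation
    missing_pct = 100.0 * missing_cells / total_cells
    duplicate_pct = 100.0 * duplicate_rows / n_obs
    return missing_pct, duplicate_pct


# Counts copied from the dataset statistics above.
missing_pct, duplicate_pct = overview_percentages(18, 116147, 68136, 8369)
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```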

Variable types

Categorical: 7
Numeric: 11

Warnings

Dataset has 8369 (7.2%) duplicate rows [Duplicates]
nearest_mrt has a high cardinality: 137 distinct values [High cardinality]
nearest_mall has a high cardinality: 137 distinct values [High cardinality]
Price ($) is highly correlated with Area (Sqft) [High correlation]
Area (Sqft) is highly correlated with Price ($) [High correlation]
tenure_yrs_clean is highly correlated with lease_commencement and 1 other fields [High correlation]
lease_commencement is highly correlated with tenure_yrs_clean [High correlation]
remaining_lease is highly correlated with tenure_yrs_clean [High correlation]
Type is highly correlated with Type of Area [High correlation]
Floor Level is highly correlated with Type of Area [High correlation]
Type of Area is highly correlated with Type and 1 other fields [High correlation]
lease_commencement has 32926 (28.3%) missing values [Missing]
remaining_lease has 32926 (28.3%) missing values [Missing]
Price ($) is highly skewed (γ1 = 70.42264601) [Skewed]
Area (Sqft) is highly skewed (γ1 = 85.0509881) [Skewed]
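
The γ1 values in the skewness warnings are skewness coefficients. The sketch below computes the population form (third central moment divided by the variance raised to 1.5). This is an assumption about the formula: the report may use a bias-corrected estimator, so values computed this way could differ slightly from the ones shown. The sample lists are made up for illustration.

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]        # hypothetical data, no tail
right_tailed = [1, 1, 1, 1, 100]   # hypothetical data, one large outlier

print(skewness(symmetric))
print(skewness(right_tailed))
```

A single extreme value is enough to push the coefficient well above zero, which is consistent with the very large maxima reported for Price ($) and Area (Sqft).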

Reproduction

Analysis started: 2021-04-01 16:43:49.130533
Analysis finished: 2021-04-01 16:44:35.903815
Duration: 46.77 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
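
The duration can be checked against the two timestamps. A minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2021-04-01 16:43:49.130533")
finished = datetime.fromisoformat("2021-04-01 16:44:35.903815")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```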

Variables

Type
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

Condominium: 52828
Apartment: 39804
Executive Condominium: 13666
Terrace: 4808
Semi-detached: 2426
Other values (4): 2615

Length

Max length: 21
Median length: 11
Mean length: 11.3940868
Min length: 7

Characters and Unicode

Total characters: 1323389
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
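
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the report's own implementation; the helper name and the two sample strings are chosen for illustration. The codes map to the category names used in this report: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Zs` is Space Separator and `Pd` is Dash Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Two of the category labels that occur in the Type variable.
counts = category_counts(["Semi-detached", "Strata Terrace"])
for code, count in counts.most_common():
    print(code, count)
```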

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Executive Condominium
2nd row: Condominium
3rd row: Executive Condominium
4th row: Condominium
5th row: Condominium

Value                  Count  Frequency (%)
Condominium            52828  45.5%
Apartment              39804  34.3%
Executive Condominium  13666  11.8%
Terrace                4808   4.1%
Semi-detached          2426   2.1%
Strata Terrace         1225   1.1%
Detached               1055   0.9%
Strata Semi-detached   250    0.2%
Strata Detached        85     0.1%

[Figure: Histogram of lengths of the category]

Value          Count  Frequency (%)
condominium    66494  50.6%
apartment      39804  30.3%
executive      13666  10.4%
terrace        6033   4.6%
semi-detached  2676   2.0%
strata         1560   1.2%
detached       1140   0.9%

Most occurring characters

ValueCountFrequency (%)
m175468
13.3%
n172792
13.1%
i149330
11.3%
o132988
10.0%
t100210
7.6%
e89510
 
6.8%
u80160
 
6.1%
d72986
 
5.5%
C66494
 
5.0%
r53430
 
4.0%
Other values (13)230021
17.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1174114
88.7%
Uppercase Letter131373
 
9.9%
Space Separator15226
 
1.2%
Dash Punctuation2676
 
0.2%

Most frequent character per category

ValueCountFrequency (%)
m175468
14.9%
n172792
14.7%
i149330
12.7%
o132988
11.3%
t100210
8.5%
e89510
7.6%
u80160
6.8%
d72986
6.2%
r53430
 
4.6%
a52773
 
4.5%
Other values (5)94467
8.0%
ValueCountFrequency (%)
C66494
50.6%
A39804
30.3%
E13666
 
10.4%
T6033
 
4.6%
S4236
 
3.2%
D1140
 
0.9%
ValueCountFrequency (%)
15226
100.0%
ValueCountFrequency (%)
-2676
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1305487
98.6%
Common17902
 
1.4%

Most frequent character per script

ValueCountFrequency (%)
m175468
13.4%
n172792
13.2%
i149330
11.4%
o132988
10.2%
t100210
7.7%
e89510
6.9%
u80160
 
6.1%
d72986
 
5.6%
C66494
 
5.1%
r53430
 
4.1%
Other values (11)212119
16.2%
ValueCountFrequency (%)
15226
85.1%
-2676
 
14.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1323389
100.0%

Most frequent character per block

ValueCountFrequency (%)
m175468
13.3%
n172792
13.1%
i149330
11.3%
o132988
10.0%
t100210
7.6%
e89510
 
6.8%
u80160
 
6.1%
d72986
 
5.5%
C66494
 
5.0%
r53430
 
4.0%
Other values (13)230021
17.4%

Postal District
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.28724806
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 10
Median: 16
Q3: 19
95th percentile: 27
Maximum: 28
Range: 27
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.004070996
Coefficient of variation (CV): 0.4581642798
Kurtosis: -0.7754186386
Mean: 15.28724806
Median Absolute Deviation (MAD): 5
Skewness: -0.2162551496
Sum: 1775568
Variance: 49.05701051
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=27)]

Value              Count  Frequency (%)
19                 17723  15.3%
5                  8125   7.0%
15                 7759   6.7%
18                 7680   6.6%
23                 7022   6.0%
3                  6753   5.8%
14                 6383   5.5%
10                 6182   5.3%
27                 5745   4.9%
9                  5321   4.6%
Other values (17)  37454  32.2%

Value  Count  Frequency (%)
1      1077   0.9%
2      1010   0.9%
3      6753   5.8%
4      1595   1.4%
5      8125   7.0%
6      5      < 0.1%
7      902    0.8%
8      1281   1.1%
9      5321   4.6%
10     6182   5.3%

Value  Count  Frequency (%)
28     2960   2.5%
27     5745   4.9%
26     922    0.8%
25     1604   1.4%
23     7022   6.0%
22     2591   2.2%
21     4214   3.6%
20     3853   3.3%
19     17723  15.3%
18     7680   6.6%

Market Segment
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 MiB

OCR: 64755
RCR: 34582
CCR: 16810

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 348441
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: OCR
2nd row: OCR
3rd row: OCR
4th row: OCR
5th row: OCR

Value  Count  Frequency (%)
OCR    64755  55.8%
RCR    34582  29.8%
CCR    16810  14.5%

[Figure: Histogram of lengths of the category]

Value  Count  Frequency (%)
ocr    64755  55.8%
rcr    34582  29.8%
ccr    16810  14.5%

Most occurring characters

ValueCountFrequency (%)
R150729
43.3%
C132957
38.2%
O64755
18.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter348441
100.0%

Most frequent character per category

ValueCountFrequency (%)
R150729
43.3%
C132957
38.2%
O64755
18.6%

Most occurring scripts

ValueCountFrequency (%)
Latin348441
100.0%

Most frequent character per script

ValueCountFrequency (%)
R150729
43.3%
C132957
38.2%
O64755
18.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII348441
100.0%

Most frequent character per block

ValueCountFrequency (%)
R150729
43.3%
C132957
38.2%
O64755
18.6%

Type of Sale
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 MiB

Resale: 58397
New Sale: 56162
Sub Sale: 1588

Length

Max length: 8
Median length: 6
Mean length: 6.994429473
Min length: 6

Characters and Unicode

Total characters: 812382
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Resale
2nd row: Resale
3rd row: Resale
4th row: Resale
5th row: Resale

Value     Count  Frequency (%)
Resale    58397  50.3%
New Sale  56162  48.4%
Sub Sale  1588   1.4%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
resale  58397  33.6%
sale    57750  33.2%
new     56162  32.3%
sub     1588   0.9%

Most occurring characters

ValueCountFrequency (%)
e230706
28.4%
a116147
14.3%
l116147
14.3%
S59338
 
7.3%
R58397
 
7.2%
s58397
 
7.2%
57750
 
7.1%
N56162
 
6.9%
w56162
 
6.9%
u1588
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter580735
71.5%
Uppercase Letter173897
 
21.4%
Space Separator57750
 
7.1%

Most frequent character per category

ValueCountFrequency (%)
e230706
39.7%
a116147
20.0%
l116147
20.0%
s58397
 
10.1%
w56162
 
9.7%
u1588
 
0.3%
b1588
 
0.3%
ValueCountFrequency (%)
S59338
34.1%
R58397
33.6%
N56162
32.3%
ValueCountFrequency (%)
57750
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin754632
92.9%
Common57750
 
7.1%

Most frequent character per script

ValueCountFrequency (%)
e230706
30.6%
a116147
15.4%
l116147
15.4%
S59338
 
7.9%
R58397
 
7.7%
s58397
 
7.7%
N56162
 
7.4%
w56162
 
7.4%
u1588
 
0.2%
b1588
 
0.2%
ValueCountFrequency (%)
57750
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII812382
100.0%

Most frequent character per block

ValueCountFrequency (%)
e230706
28.4%
a116147
14.3%
l116147
14.3%
S59338
 
7.3%
R58397
 
7.2%
s58397
 
7.2%
57750
 
7.1%
N56162
 
6.9%
w56162
 
6.9%
u1588
 
0.2%

Price ($)
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 19922
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1839632.184
Minimum: 40000
Maximum: 980000000
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 40000
5th percentile: 686000
Q1: 932000
Median: 1254000
Q3: 1772985
95th percentile: 3850000
Maximum: 980000000
Range: 979960000
Interquartile range (IQR): 840985

Descriptive statistics

Standard deviation: 9365558.147
Coefficient of variation (CV): 5.090994943
Kurtosis: 5853.210042
Mean: 1839632.184
Median Absolute Deviation (MAD): 376200
Skewness: 70.42264601
Sum: 2.136677593 × 10^11
Variance: 8.771367941 × 10^13
Monotonicity: Not monotonic
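
Several of the Price ($) figures are derived from others in the same block. The sketch below recomputes the range, IQR and coefficient of variation from the reported values. It also computes an upper outlier fence with the common 1.5 × IQR rule of thumb; that rule is an addition for illustration and not something the report itself applies.

```python
# Values copied from the Price ($) statistics above.
q1, q3 = 932000, 1772985
minimum, maximum = 40000, 980000000
mean, std = 1839632.184, 9365558.147

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation

# Assumption: Tukey-style upper fence, a rule of thumb for flagging outliers.
upper_fence = q3 + 1.5 * iqr

print(iqr, value_range, round(cv, 3), upper_fence)
```

Under that rule of thumb the reported maximum lies far above the fence, which fits the skewness warning for this variable.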
[Figure: Histogram with fixed size bins (bins=50)]

Value                 Count   Frequency (%)
1200000               708     0.6%
1100000               660     0.6%
1300000               605     0.5%
1050000               595     0.5%
1500000               580     0.5%
1150000               522     0.4%
1250000               499     0.4%
1400000               497     0.4%
1600000               491     0.4%
1180000               475     0.4%
Other values (19912)  110515  95.2%

Value   Count  Frequency (%)
40000   1      < 0.1%
50000   2      < 0.1%
63000   1      < 0.1%
288000  1      < 0.1%
300000  1      < 0.1%
330000  2      < 0.1%
356000  1      < 0.1%
358000  1      < 0.1%
360000  1      < 0.1%
362000  1      < 0.1%

Value      Count  Frequency (%)
980000000  1      < 0.1%
970000000  1      < 0.1%
906889000  1      < 0.1%
906700000  1      < 0.1%
840888888  1      < 0.1%
765781819  1      < 0.1%
728000000  1      < 0.1%
638000000  1      < 0.1%
629000000  1      < 0.1%
610000000  1      < 0.1%

Area (Sqft)
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 3761
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1349.925302
Minimum: 258
Maximum: 947081
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 258
5th percentile: 474
Q1: 721
Median: 1033
Q3: 1346
95th percentile: 2888
Maximum: 947081
Range: 946823
Interquartile range (IQR): 625

Descriptive statistics

Standard deviation: 6217.933946
Coefficient of variation (CV): 4.606131864
Kurtosis: 9000.076898
Mean: 1349.925302
Median Absolute Deviation (MAD): 312
Skewness: 85.0509881
Sum: 156789774
Variance: 38662702.55
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count   Frequency (%)
958                  1610    1.4%
463                  1602    1.4%
915                  1526    1.3%
764                  1519    1.3%
1055                 1473    1.3%
678                  1419    1.2%
1109                 1416    1.2%
700                  1410    1.2%
484                  1394    1.2%
904                  1349    1.2%
Other values (3751)  101429  87.3%

Value  Count  Frequency (%)
258    1      < 0.1%
323    13     < 0.1%
334    28     < 0.1%
344    39     < 0.1%
355    57     < 0.1%
366    111    0.1%
377    70     0.1%
388    117    0.1%
398    231    0.2%
409    256    0.2%

Value   Count  Frequency (%)
947081  1      < 0.1%
629263  1      < 0.1%
601040  1      < 0.1%
563829  1      < 0.1%
558565  1      < 0.1%
520407  1      < 0.1%
479439  1      < 0.1%
416750  1      < 0.1%
405114  1      < 0.1%
382919  1      < 0.1%

Type of Area
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 MiB

Strata: 107837
Land: 8310

Length

Max length: 6
Median length: 6
Mean length: 5.856905473
Min length: 4

Characters and Unicode

Total characters: 680262
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Strata
2nd row: Strata
3rd row: Strata
4th row: Strata
5th row: Strata

Value   Count   Frequency (%)
Strata  107837  92.8%
Land    8310    7.2%

[Figure: Histogram of lengths of the category]

Value   Count   Frequency (%)
strata  107837  92.8%
land    8310    7.2%

Most occurring characters

ValueCountFrequency (%)
a223984
32.9%
t215674
31.7%
S107837
15.9%
r107837
15.9%
L8310
 
1.2%
n8310
 
1.2%
d8310
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter564115
82.9%
Uppercase Letter116147
 
17.1%

Most frequent character per category

ValueCountFrequency (%)
a223984
39.7%
t215674
38.2%
r107837
19.1%
n8310
 
1.5%
d8310
 
1.5%
ValueCountFrequency (%)
S107837
92.8%
L8310
 
7.2%

Most occurring scripts

ValueCountFrequency (%)
Latin680262
100.0%

Most frequent character per script

ValueCountFrequency (%)
a223984
32.9%
t215674
31.7%
S107837
15.9%
r107837
15.9%
L8310
 
1.2%
n8310
 
1.2%
d8310
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII680262
100.0%

Most frequent character per block

ValueCountFrequency (%)
a223984
32.9%
t215674
31.7%
S107837
15.9%
r107837
15.9%
L8310
 
1.2%
n8310
 
1.2%
d8310
 
1.2%

Floor Level
Categorical

HIGH CORRELATION

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 MiB

01 to 05: 37908
06 to 10: 28128
11 to 15: 20111
-: 9913
16 to 20: 9475
Other values (12): 10612

Length

Max length: 8
Median length: 8
Mean length: 7.402558826
Min length: 1

Characters and Unicode

Total characters: 859785
Distinct characters: 13
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 01 to 05
2nd row: 11 to 15
3rd row: 11 to 15
4th row: 16 to 20
5th row: 16 to 20

Value             Count  Frequency (%)
01 to 05          37908  32.6%
06 to 10          28128  24.2%
11 to 15          20111  17.3%
-                 9913   8.5%
16 to 20          9475   8.2%
21 to 25          4422   3.8%
26 to 30          2714   2.3%
31 to 35          1942   1.7%
36 to 40          933    0.8%
41 to 45          319    0.3%
Other values (7)  282    0.2%

[Figure: Histogram of lengths of the category]
ValueCountFrequency (%)
to106234
32.3%
0137908
 
11.5%
0537908
 
11.5%
1028128
 
8.6%
0628128
 
8.6%
1520111
 
6.1%
1120111
 
6.1%
9913
 
3.0%
209475
 
2.9%
169475
 
2.9%
Other values (24)21224
 
6.5%

Most occurring characters

ValueCountFrequency (%)
212468
24.7%
0145373
16.9%
1142630
16.6%
t106234
12.4%
o106234
12.4%
565085
 
7.6%
641535
 
4.8%
221033
 
2.4%
-9913
 
1.2%
37531
 
0.9%
Other values (3)1749
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number424910
49.4%
Space Separator212468
24.7%
Lowercase Letter212468
24.7%
Dash Punctuation9913
 
1.2%
Uppercase Letter26
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
0145373
34.2%
1142630
33.6%
565085
15.3%
641535
 
9.8%
221033
 
4.9%
37531
 
1.8%
41694
 
0.4%
729
 
< 0.1%
ValueCountFrequency (%)
t106234
50.0%
o106234
50.0%
ValueCountFrequency (%)
212468
100.0%
ValueCountFrequency (%)
-9913
100.0%
ValueCountFrequency (%)
B26
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common647291
75.3%
Latin212494
 
24.7%

Most frequent character per script

ValueCountFrequency (%)
212468
32.8%
0145373
22.5%
1142630
22.0%
565085
 
10.1%
641535
 
6.4%
221033
 
3.2%
-9913
 
1.5%
37531
 
1.2%
41694
 
0.3%
729
 
< 0.1%
ValueCountFrequency (%)
t106234
50.0%
o106234
50.0%
B26
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII859785
100.0%

Most frequent character per block

ValueCountFrequency (%)
212468
24.7%
0145373
16.9%
1142630
16.6%
t106234
12.4%
o106234
12.4%
565085
 
7.6%
641535
 
4.8%
221033
 
2.4%
-9913
 
1.2%
37531
 
0.9%
Other values (3)1749
 
0.2%
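
Floor Level is stored as text ranges such as "01 to 05", with "-" used where no floor is given. The sketch below shows one way such labels could be turned into numeric bounds. The function name is hypothetical, and the handling of a "B" prefix as a basement level is an assumption: the report only shows that an uppercase "B" occurs 26 times, not the exact label format.

```python
import re

# Matches labels like "01 to 05"; the optional "B" prefix is an assumed
# basement marker (the exact format is not shown in the report).
_RANGE = re.compile(r"^(B?)(\d+) to (B?)(\d+)$")


def parse_floor_level(label):
    """Return (low, high) floors for a range label, or None for '-'."""
    label = label.strip()
    if label == "-":
        return None  # placeholder used when no floor level is recorded
    match = _RANGE.match(label)
    if match is None:
        raise ValueError(f"Unrecognised floor level: {label!r}")
    low_b, low, high_b, high = match.groups()
    low_floor = -int(low) if low_b else int(low)
    high_floor = -int(high) if high_b else int(high)
    return low_floor, high_floor


print(parse_floor_level("01 to 05"))
print(parse_floor_level("-"))
```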

Unit Price ($psf)
Real number (ℝ≥0)

Distinct: 3346
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1395.469044
Minimum: 33
Maximum: 5896
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 33
5th percentile: 742
Q1: 1011
Median: 1339
Q3: 1670
95th percentile: 2399
Maximum: 5896
Range: 5863
Interquartile range (IQR): 659

Descriptive statistics

Standard deviation: 519.7054367
Coefficient of variation (CV): 0.3724234795
Kurtosis: 2.395190266
Mean: 1395.469044
Median Absolute Deviation (MAD): 329
Skewness: 1.161015815
Sum: 162079543
Variance: 270093.7409
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count   Frequency (%)
929                  328     0.3%
1394                 257     0.2%
1548                 227     0.2%
1161                 207     0.2%
1327                 175     0.2%
1239                 173     0.1%
1486                 165     0.1%
1858                 163     0.1%
1301                 162     0.1%
1000                 161     0.1%
Other values (3336)  114129  98.3%

Value  Count  Frequency (%)
33     1      < 0.1%
55     1      < 0.1%
59     1      < 0.1%
69     1      < 0.1%
100    1      < 0.1%
109    1      < 0.1%
120    1      < 0.1%
127    1      < 0.1%
130    1      < 0.1%
135    1      < 0.1%

Value  Count  Frequency (%)
5896   1      < 0.1%
5633   1      < 0.1%
5305   1      < 0.1%
5125   1      < 0.1%
5050   1      < 0.1%
4987   1      < 0.1%
4936   1      < 0.1%
4927   1      < 0.1%
4913   1      < 0.1%
4899   1      < 0.1%

tenure_yrs_clean
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 14
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 346.7920064
Minimum: 60
Maximum: 900
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 60
5th percentile: 99
Q1: 99
Median: 99
Q3: 900
95th percentile: 900
Maximum: 900
Range: 840
Interquartile range (IQR): 801

Descriptive statistics

Standard deviation: 370.2756669
Coefficient of variation (CV): 1.067716845
Kurtosis: -1.319813126
Mean: 346.7920064
Median Absolute Deviation (MAD): 0
Skewness: 0.8247143536
Sum: 40277464
Variance: 137104.0695
Monotonicity: Not monotonic
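
A median absolute deviation of 0 next to a standard deviation of about 370 can look contradictory. It follows from more than half of the observations sharing the median value of 99. The sketch below shows the effect on a small made-up sample with the same shape; the function name and sample are illustrative.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


# Hypothetical sample: 70% of values at 99, 30% at 900.
sample = [99] * 7 + [900] * 3
print(median_absolute_deviation(sample))
```

Because at least half of the deviations are exactly zero, their median is zero regardless of how far the remaining values lie from 99.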
[Figure: Histogram with fixed size bins (bins=14)]

Value             Count  Frequency (%)
99                79663  68.6%
900               35933  30.9%
103               236    0.2%
60                94     0.1%
102               70     0.1%
100               57     < 0.1%
70                26     < 0.1%
110               24     < 0.1%
101               22     < 0.1%
85                8      < 0.1%
Other values (4)  10     < 0.1%

Value  Count  Frequency (%)
60     94     0.1%
70     26     < 0.1%
85     8      < 0.1%
89     3      < 0.1%
93     1      < 0.1%
97     1      < 0.1%
99     79663  68.6%
100    57     < 0.1%
101    22     < 0.1%
102    70     0.1%

Value  Count  Frequency (%)
900    35933  30.9%
110    24     < 0.1%
104    5      < 0.1%
103    236    0.2%
102    70     0.1%
101    22     < 0.1%
100    57     < 0.1%
99     79663  68.6%
97     1      < 0.1%
93     1      < 0.1%

lease_commencement
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 90
Distinct (%): 0.1%
Missing: 32926
Missing (%): 28.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 2004.87246
Minimum: 1827
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 1827
5th percentile: 1956
Q1: 2007
Median: 2014
Q3: 2017
95th percentile: 2018
Maximum: 2020
Range: 193
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 28.26883727
Coefficient of variation (CV): 0.01410006763
Kurtosis: 14.64618096
Mean: 2004.87246
Median Absolute Deviation (MAD): 4
Skewness: -3.842975042
Sum: 166847491
Variance: 799.1271604
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value              Count  Frequency (%)
2018               13139  11.3%
2014               8043   6.9%
2015               7360   6.3%
2016               6049   5.2%
2017               5054   4.4%
2013               5026   4.3%
2011               4311   3.7%
2012               3753   3.2%
2010               3383   2.9%
2019               3113   2.7%
Other values (80)  23990  20.7%
(Missing)          32926  28.3%

Value  Count  Frequency (%)
1827   34     < 0.1%
1835   2      < 0.1%
1841   201    0.2%
1874   29     < 0.1%
1875   154    0.1%
1876   190    0.2%
1877   470    0.4%
1878   185    0.2%
1879   458    0.4%
1881   27     < 0.1%

Value  Count  Frequency (%)
2020   117    0.1%
2019   3113   2.7%
2018   13139  11.3%
2017   5054   4.4%
2016   6049   5.2%
2015   7360   6.3%
2014   8043   6.9%
2013   5026   4.3%
2012   3753   3.2%
2011   4311   3.7%

sale_yr
Real number (ℝ≥0)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.059692
Minimum: 2016
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 2016
5th percentile: 2016
Q1: 2017
Median: 2018
Q3: 2019
95th percentile: 2020
Maximum: 2021
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.445587991
Coefficient of variation (CV): 0.0007163256852
Kurtosis: -1.1221707
Mean: 2018.059692
Median Absolute Deviation (MAD): 1
Skewness: 0.1665534896
Sum: 234391579
Variance: 2.089724641
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]

Value  Count  Frequency (%)
2017   28366  24.4%
2018   23442  20.2%
2020   22423  19.3%
2019   19480  16.8%
2016   19267  16.6%
2021   3169   2.7%

Value  Count  Frequency (%)
2016   19267  16.6%
2017   28366  24.4%
2018   23442  20.2%
2019   19480  16.8%
2020   22423  19.3%
2021   3169   2.7%

Value  Count  Frequency (%)
2021   3169   2.7%
2020   22423  19.3%
2019   19480  16.8%
2018   23442  20.2%
2017   28366  24.4%
2016   19267  16.6%

remaining_lease
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 168
Distinct (%): 0.2%
Missing: 32926
Missing (%): 28.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.9678807
Minimum: 3
Maximum: 900
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 3
5th percentile: 75
Q1: 90
Median: 96
Q3: 98
95th percentile: 723
Maximum: 900
Range: 897
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 151.1697154
Coefficient of variation (CV): 1.190613835
Kurtosis: 14.59101237
Mean: 126.9678807
Median Absolute Deviation (MAD): 2
Skewness: 4.058403998
Sum: 10566394
Variance: 22852.28285
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value               Count  Frequency (%)
98                  16218  14.0%
97                  14104  12.1%
96                  9052   7.8%
95                  4327   3.7%
99                  3870   3.3%
90                  2688   2.3%
91                  2619   2.3%
93                  2614   2.3%
92                  2514   2.2%
94                  1967   1.7%
Other values (158)  23248  20.0%
(Missing)           32926  28.3%

Value  Count  Frequency (%)
3      3      < 0.1%
4      1      < 0.1%
14     7      < 0.1%
15     6      < 0.1%
16     5      < 0.1%
17     4      < 0.1%
18     4      < 0.1%
21     1      < 0.1%
26     2      < 0.1%
27     3      < 0.1%

Value  Count  Frequency (%)
900    1      < 0.1%
895    1      < 0.1%
885    1      < 0.1%
879    7      < 0.1%
878    13     < 0.1%
877    12     < 0.1%
876    5      < 0.1%
875    4      < 0.1%
874    4      < 0.1%
873    4      < 0.1%

nearest_mrt
Categorical

HIGH CARDINALITY

Distinct: 137
Distinct (%): 0.1%
Missing: 456
Missing (%): 0.4%
Memory size: 8.6 MiB

Clementi MRT Station: 5650
Potong Pasir MRT Station: 4714
Kovan MRT Station: 3725
Hougang MRT Station: 2896
Tampines West MRT Station: 2759
Other values (132): 95947

Length

Max length: 29
Median length: 21
Mean length: 20.90818646
Min length: 16

Characters and Unicode

Total characters: 2418889
Distinct characters: 49
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Tampines West MRT Station
2nd row: Bedok Reservoir MRT Station
3rd row: Tampines West MRT Station
4th row: Bedok Reservoir MRT Station
5th row: Bedok Reservoir MRT Station

Value                      Count  Frequency (%)
Clementi MRT Station       5650   4.9%
Potong Pasir MRT Station   4714   4.1%
Kovan MRT Station          3725   3.2%
Hougang MRT Station        2896   2.5%
Tampines West MRT Station  2759   2.4%
Queenstown MRT Station     2717   2.3%
Sembawang MRT Station      2636   2.3%
Bedok MRT Station          2506   2.2%
Dakota MRT Station         2489   2.1%
Eunos MRT Station          2411   2.1%
Other values (127)         83188  71.6%

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
station             115691  29.2%
mrt                 102796  26.0%
lrt                 12895   3.3%
pasir               7614    1.9%
clementi            5650    1.4%
potong              4714    1.2%
tampines            4527    1.1%
kovan               3725    0.9%
bedok               3265    0.8%
park                3005    0.8%
Other values (169)  131916  33.3%

Most occurring characters

ValueCountFrequency (%)
t280782
11.6%
280107
11.6%
a223906
 
9.3%
n203827
 
8.4%
o189892
 
7.9%
i173679
 
7.2%
T129513
 
5.4%
S127736
 
5.3%
R122551
 
5.1%
M108270
 
4.5%
Other values (39)578626
23.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1510768
62.5%
Uppercase Letter627841
26.0%
Space Separator280107
 
11.6%
Dash Punctuation173
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
T129513
20.6%
S127736
20.3%
R122551
19.5%
M108270
17.2%
P21127
 
3.4%
L19314
 
3.1%
K17436
 
2.8%
B15679
 
2.5%
C13313
 
2.1%
H8311
 
1.3%
Other values (14)44591
 
7.1%
ValueCountFrequency (%)
t280782
18.6%
a223906
14.8%
n203827
13.5%
o189892
12.6%
i173679
11.5%
e86761
 
5.7%
r52944
 
3.5%
g43926
 
2.9%
s33607
 
2.2%
l32243
 
2.1%
Other values (13)189201
12.5%
ValueCountFrequency (%)
280107
100.0%
ValueCountFrequency (%)
-173
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2138609
88.4%
Common280280
 
11.6%

Most frequent character per script

ValueCountFrequency (%)
t280782
13.1%
a223906
10.5%
n203827
9.5%
o189892
 
8.9%
i173679
 
8.1%
T129513
 
6.1%
S127736
 
6.0%
R122551
 
5.7%
M108270
 
5.1%
e86761
 
4.1%
Other values (37)491692
23.0%
ValueCountFrequency (%)
280107
99.9%
-173
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2418889
100.0%

Most frequent character per block

ValueCountFrequency (%)
t280782
11.6%
280107
11.6%
a223906
 
9.3%
n203827
 
8.4%
o189892
 
7.9%
i173679
 
7.2%
T129513
 
5.4%
S127736
 
5.3%
R122551
 
5.1%
M108270
 
4.5%
Other values (39)578626
23.9%

nearest_mrt_dist
Real number (ℝ≥0)

Distinct: 2809
Distinct (%): 2.4%
Missing: 456
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 732.1908643
Minimum: 12.31416907
Maximum: 3846.761956
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 12.31416907
5th percentile: 125.4554316
Q1: 356.9170077
Median: 643.1972738
Q3: 981.5382877
95th percentile: 1679.890005
Maximum: 3846.761956
Range: 3834.447787
Interquartile range (IQR): 624.62128

Descriptive statistics

Standard deviation: 490.0188848
Coefficient of variation (CV): 0.6692502033
Kurtosis: 0.8742301879
Mean: 732.1908643
Median Absolute Deviation (MAD): 297.4412712
Skewness: 1.005165477
Sum: 84707893.28
Variance: 240118.5074
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count   Frequency (%)
717.8234658          3390    2.9%
645.7591481          1715    1.5%
159.9025001          1383    1.2%
856.8236242          1375    1.2%
478.5646644          1202    1.0%
477.3735502          1125    1.0%
807.2202822          1078    0.9%
743.2810752          967     0.8%
262.9989849          966     0.8%
1403.759201          879     0.8%
Other values (2799)  101611  87.5%

Value        Count  Frequency (%)
12.31416907  6      < 0.1%
37.29543873  1      < 0.1%
45.00549768  1      < 0.1%
53.52680652  45     < 0.1%
60.66151122  404    0.3%
63.48853814  8      < 0.1%
66.97881847  4      < 0.1%
68.27312857  37     < 0.1%
71.12167461  4      < 0.1%
75.59524915  72     0.1%

Value        Count  Frequency (%)
3846.761956  3      < 0.1%
3373.035761  15     < 0.1%
3308.067679  34     < 0.1%
3292.598781  47     < 0.1%
3124.73284   2      < 0.1%
3103.012649  23     < 0.1%
3051.261891  13     < 0.1%
3021.514002  2      < 0.1%
2994.114799  32     < 0.1%
2894.895812  1      < 0.1%

nearest_mall
Categorical

HIGH CARDINALITY

Distinct: 137
Distinct (%): 0.1%
Missing: 456
Missing (%): 0.4%
Memory size: 7.9 MiB

The Poiz: 7200
The Clementi Mall: 4218
Tiong Bahru Plaza: 3145
Anchorpoint: 3137
Our Tampines Hub: 2902
Other values (132): 95089

Length

Max length: 31
Median length: 14
Mean length: 14.0438755
Min length: 3

Characters and Unicode

Total characters: 1624750
Distinct characters: 62
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Our Tampines Hub
2nd row: East Village
3rd row: Our Tampines Hub
4th row: East Village
5th row: East Village

Value                      Count  Frequency (%)
The Poiz                   7200   6.2%
The Clementi Mall          4218   3.6%
Tiong Bahru Plaza          3145   2.7%
Anchorpoint                3137   2.7%
Our Tampines Hub           2902   2.5%
Eastpoint Mall             2703   2.3%
Heartland Mall             2412   2.1%
The Seletar Mall           2292   2.0%
KINEX                      2288   2.0%
Sembawang Shopping Centre  2197   1.9%
Other values (127)         83197  71.6%

[Figure: Histogram of lengths of the category]

Value               Count   Frequency (%)
mall                26397   9.9%
the                 18332   6.9%
plaza               14789   5.6%
centre              13452   5.1%
shopping            11335   4.3%
poiz                7200    2.7%
square              5173    1.9%
clementi            4843    1.8%
serangoon           4459    1.7%
point               4302    1.6%
Other values (160)  155102  58.4%

Most occurring characters

ValueCountFrequency (%)
a163430
 
10.1%
149693
 
9.2%
e136998
 
8.4%
l118024
 
7.3%
n111806
 
6.9%
o84681
 
5.2%
i81936
 
5.0%
t75327
 
4.6%
r73449
 
4.5%
h48064
 
3.0%
Other values (52)581342
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1179927
72.6%
Uppercase Letter280300
 
17.3%
Space Separator149693
 
9.2%
Decimal Number13965
 
0.9%
Other Punctuation862
 
0.1%
Math Symbol3
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
P36796
13.1%
T32207
11.5%
M31130
11.1%
S30084
10.7%
C28802
10.3%
H14721
 
5.3%
B12993
 
4.6%
W11221
 
4.0%
E10926
 
3.9%
V10692
 
3.8%
Other values (16)60728
21.7%
ValueCountFrequency (%)
a163430
13.9%
e136998
11.6%
l118024
10.0%
n111806
9.5%
o84681
 
7.2%
i81936
 
6.9%
t75327
 
6.4%
r73449
 
6.2%
h48064
 
4.1%
g44584
 
3.8%
Other values (16)241628
20.5%
ValueCountFrequency (%)
15601
40.1%
24109
29.4%
02039
 
14.6%
81108
 
7.9%
3625
 
4.5%
9483
 
3.5%
ValueCountFrequency (%)
'674
78.2%
.188
 
21.8%
ValueCountFrequency (%)
149693
100.0%
ValueCountFrequency (%)
+3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1460227
89.9%
Common164523
 
10.1%

Most frequent character per script

ValueCountFrequency (%)
a163430
 
11.2%
e136998
 
9.4%
l118024
 
8.1%
n111806
 
7.7%
o84681
 
5.8%
i81936
 
5.6%
t75327
 
5.2%
r73449
 
5.0%
h48064
 
3.3%
g44584
 
3.1%
Other values (42)521928
35.7%
ValueCountFrequency (%)
149693
91.0%
15601
 
3.4%
24109
 
2.5%
02039
 
1.2%
81108
 
0.7%
'674
 
0.4%
3625
 
0.4%
9483
 
0.3%
.188
 
0.1%
+3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1624750
100.0%

Most frequent character per block

ValueCountFrequency (%)
a163430
 
10.1%
149693
 
9.2%
e136998
 
8.4%
l118024
 
7.3%
n111806
 
6.9%
o84681
 
5.2%
i81936
 
5.0%
t75327
 
4.6%
r73449
 
4.5%
h48064
 
3.0%
Other values (52)581342
35.8%

nearest_mall_dist
Real number (ℝ≥0)

Distinct: 2803
Distinct (%): 2.4%
Missing: 456
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 780.4680342
Minimum: 0
Maximum: 3456.159897
Zeros: 1089
Zeros (%): 0.9%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 131.0174774
Q1: 452.9956561
Median: 711.64741
Q3: 1025.084616
95th percentile: 1627.46941
Maximum: 3456.159897
Range: 3456.159897
Interquartile range (IQR): 572.08896

Descriptive statistics

Standard deviation: 476.1707895
Coefficient of variation (CV): 0.6101092788
Kurtosis: 2.025828171
Mean: 780.4680342
Median Absolute Deviation (MAD): 287.465676
Skewness: 1.044519764
Sum: 90293127.35
Variance: 226738.6208
Monotonicity: Not monotonic
2021-04-02T00:44:46.056717image/svg+xmlMatplotlib v3.4.1, https://matplotlib.org/
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
947.6915038 | 3390 | 2.9%
567.0746204 | 1715 | 1.5%
988.7172262 | 1383 | 1.2%
543.0615692 | 1375 | 1.2%
270.5905967 | 1202 | 1.0%
876.335862 | 1125 | 1.0%
0 | 1089 | 0.9%
650.5969117 | 1078 | 0.9%
874.2244184 | 967 | 0.8%
491.6662136 | 966 | 0.8%
Other values (2793) | 101401 | 87.3%
Value | Count | Frequency (%)
0 | 1089 | 0.9%
2.204837952 × 10^-11 | 13 | < 0.1%
1.581505939 × 10^-9 | 2 | < 0.1%
1.581531395 × 10^-9 | 2 | < 0.1%
1.581537041 × 10^-9 | 80 | 0.1%
1.581539195 × 10^-9 | 18 | < 0.1%
1.581539426 × 10^-9 | 10 | < 0.1%
1.581541594 × 10^-9 | 17 | < 0.1%
1.581596948 × 10^-9 | 9 | < 0.1%
2.605476064 | 11 | < 0.1%
ValueCountFrequency (%)
3456.1598975
 
< 0.1%
3303.2103573
 
< 0.1%
3278.9011815
 
< 0.1%
3258.817247
< 0.1%
3239.77208434
< 0.1%
3210.9283663
 
< 0.1%
3124.4812154
 
< 0.1%
3124.37668975
0.1%
3096.24516735
< 0.1%
3067.90378923
 
< 0.1%

nearest_cbd_dist
Real number (ℝ≥0)

Distinct: 2809
Distinct (%): 2.4%
Missing: 456
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 9347.902679
Minimum: 121.0838467
Maximum: 20416.51831
Zeros: 0
Zeros (%): 0.0%
Memory size: 907.5 KiB

Quantile statistics

Minimum: 121.0838467
5-th percentile: 2356.218549
Q1: 5624.75336
median: 9541.864938
Q3: 12914.5806
95-th percentile: 17043.35729
Maximum: 20416.51831
Range: 20295.43446
Interquartile range (IQR): 7289.827239

Descriptive statistics

Standard deviation: 4572.110706
Coefficient of variation (CV): 0.4891055099
Kurtosis: -0.9160661678
Mean: 9347.902679
Median Absolute Deviation (MAD): 3504.116114
Skewness: 0.1316024064
Sum: 1081468209
Variance: 20904196.31
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
6102.667363 | 3390 | 2.9%
12914.5806 | 1715 | 1.5%
7022.694828 | 1383 | 1.2%
11229.74826 | 1375 | 1.2%
5328.887155 | 1202 | 1.0%
7947.968625 | 1125 | 1.0%
11031.87649 | 1078 | 0.9%
10309.86348 | 967 | 0.8%
15441.38343 | 966 | 0.8%
9655.259237 | 879 | 0.8%
Other values (2799) | 101611 | 87.5%
ValueCountFrequency (%)
121.08384678
 
< 0.1%
144.88039185
 
0.2%
179.03656288
 
< 0.1%
227.411350478
 
0.1%
432.132959947
 
< 0.1%
494.595056756
 
< 0.1%
503.6405818516
0.4%
514.7186173109
 
0.1%
581.185946827
 
< 0.1%
696.9791832
 
< 0.1%
ValueCountFrequency (%)
20416.518314
 
< 0.1%
19970.3026479
0.1%
19962.58227
 
< 0.1%
19880.208962
 
< 0.1%
19859.495742
 
< 0.1%
19833.83672
 
< 0.1%
19513.233499
 
< 0.1%
19486.901891
 
< 0.1%
19437.656749
 
< 0.1%
19421.84705120
0.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the slope of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
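
The definition above can be sketched in a few lines of plain Python. This is an illustration of the formula, not the code the report itself runs; `pearson_r` is a hypothetical helper, and the inputs are Area (Sqft) and Price ($) taken from five of the first rows shown in the Sample section.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Illustrative inputs: Area (Sqft) and Price ($) from sample rows 0, 1, 2, 4 and 7.
area = [1173, 1227, 958, 2099, 893]
price = [1060000, 1285000, 900000, 1580000, 850000]

r = pearson_r(area, price)
# Shifting and (positively) rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([a * 0.0929 for a in area], [p / 1000 + 5 for p in price])
print(round(r, 4), round(r_rescaled, 4))
```

The second call converts the area to a different unit and shifts the price, which demonstrates the invariance under location and scale changes mentioned above.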

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
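
As a minimal sketch of that recipe, the code below ranks each variable and then applies the Pearson formula to the ranks. The helper names are hypothetical, and giving tied values the average of their positions is an assumption about tie handling (it is the common convention, but the text above does not specify one).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r (assumes neither input is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: rho reaches 1 while r stays below 1.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```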

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
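
The pair-counting rule can be sketched directly. The function below implements the variant usually called tau-a, which matches the description above word for word; statistical libraries often report the tie-adjusted tau-b instead, so treat the choice of variant (and the helper name `kendall_tau_a`) as assumptions of this example.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 is a tie and counts toward neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```

The loop visits every pair, so the cost grows quadratically with the number of observations; this is fine for illustration but slow for a dataset of this size.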

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
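
A minimal sketch of the bias-corrected estimator follows, written from the formula in Bergsma's 2013 proposal rather than from the report's own source code. The function name, the choice to return NaN when a variable has a single category, and the category labels in the usage lines are all assumptions made for illustration.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    pair_counts = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = pair_counts.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias corrections: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return float("nan")  # assumption: undefined when a variable is constant
    return math.sqrt(phi2_corr / denom)


# Hypothetical labels: perfectly associated versus independent columns.
perfect = cramers_v_corrected(["x"] * 50 + ["y"] * 50, ["p"] * 50 + ["q"] * 50)
independent = cramers_v_corrected(["x", "x", "y", "y"], ["p", "q", "p", "q"])
print(perfect, independent)
```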

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
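
The nullity correlation described for the heatmap can be illustrated with a small sketch. It assumes the measure is the Pearson correlation between "value is present" indicators, which for two binary indicators reduces to the phi coefficient computed below. The helper names and toy columns are hypothetical; the toy data only mimics a situation such as lease_commencement and remaining_lease, which the report lists with the same number of missing values.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the presence indicators of two columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same length")
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(col_a, col_b):
        present_a, present_b = not is_missing(a), not is_missing(b)
        if present_a and present_b:
            n11 += 1
        elif present_a:
            n10 += 1
        elif present_b:
            n01 += 1
        else:
            n00 += 1
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")  # a column that is fully present or fully missing
    return (n11 * n00 - n10 * n01) / denom


# Toy columns (hypothetical values): both are missing on exactly the same rows.
lease_commencement = [2011.0, None, 1996.0, None, 2015.0]
remaining_lease = [89.0, None, 74.0, None, 98.0]
print(nullity_correlation(lease_commencement, remaining_lease))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present exactly when the other is missing, and a value near 0 means their missingness is unrelated.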

Sample

First rows

Index | Type | Postal District | Market Segment | Type of Sale | Price ($) | Area (Sqft) | Type of Area | Floor Level | Unit Price ($psf) | tenure_yrs_clean | lease_commencement | sale_yr | remaining_lease | nearest_mrt | nearest_mrt_dist | nearest_mall | nearest_mall_dist | nearest_cbd_dist
0 | Executive Condominium | 18 | OCR | Resale | 1060000 | 1173 | Strata | 01 to 05 | 903 | 99.0 | 2011.0 | 2021 | 89.0 | Tampines West MRT Station | 1351.020868 | Our Tampines Hub | 1377.881503 | 11514.460859
1 | Condominium | 16 | OCR | Resale | 1285000 | 1227 | Strata | 11 to 15 | 1047 | 99.0 | 1996.0 | 2021 | 74.0 | Bedok Reservoir MRT Station | 263.129128 | East Village | 1323.301689 | 11255.388399
2 | Executive Condominium | 18 | OCR | Resale | 900000 | 958 | Strata | 11 to 15 | 939 | 99.0 | 2011.0 | 2021 | 89.0 | Tampines West MRT Station | 1351.020868 | Our Tampines Hub | 1377.881503 | 11514.460859
3 | Condominium | 16 | OCR | Resale | 1280000 | 1227 | Strata | 16 to 20 | 1043 | 99.0 | 1996.0 | 2021 | 74.0 | Bedok Reservoir MRT Station | 263.129128 | East Village | 1323.301689 | 11255.388399
4 | Condominium | 16 | OCR | Resale | 1580000 | 2099 | Strata | 16 to 20 | 753 | 99.0 | 1996.0 | 2021 | 74.0 | Bedok Reservoir MRT Station | 263.129128 | East Village | 1323.301689 | 11255.388399
5 | Executive Condominium | 18 | OCR | Resale | 850000 | 958 | Strata | 01 to 05 | 887 | 99.0 | 2011.0 | 2021 | 89.0 | Tampines West MRT Station | 1351.020868 | Our Tampines Hub | 1377.881503 | 11514.460859
6 | Executive Condominium | 18 | OCR | Resale | 1088000 | 1130 | Strata | 11 to 15 | 963 | 99.0 | 2011.0 | 2021 | 89.0 | Tampines West MRT Station | 1351.020868 | Our Tampines Hub | 1377.881503 | 11514.460859
7 | Condominium | 16 | OCR | Resale | 850000 | 893 | Strata | 06 to 10 | 951 | 99.0 | 1996.0 | 2021 | 74.0 | Bedok Reservoir MRT Station | 263.129128 | East Village | 1323.301689 | 11255.388399
8 | Condominium | 16 | OCR | Resale | 678000 | 527 | Strata | 01 to 05 | 1285 | 99.0 | 2011.0 | 2021 | 89.0 | Bedok North MRT Station | 466.298206 | Djitsun Mall | 1613.952313 | 9911.264731
9 | Condominium | 16 | OCR | Resale | 1070000 | 1206 | Strata | 11 to 15 | 888 | 99.0 | 1996.0 | 2021 | 74.0 | Bedok Reservoir MRT Station | 263.129128 | East Village | 1323.301689 | 11255.388399

Last rows

Index | Type | Postal District | Market Segment | Type of Sale | Price ($) | Area (Sqft) | Type of Area | Floor Level | Unit Price ($psf) | tenure_yrs_clean | lease_commencement | sale_yr | remaining_lease | nearest_mrt | nearest_mrt_dist | nearest_mall | nearest_mall_dist | nearest_cbd_dist
116137 | Apartment | 23 | OCR | Resale | 980000 | 1302 | Strata | 16 to 20 | 752 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116138 | Apartment | 23 | OCR | Resale | 750000 | 915 | Strata | 06 to 10 | 820 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116139 | Apartment | 23 | OCR | Resale | 750000 | 904 | Strata | 16 to 20 | 829 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116140 | Apartment | 23 | OCR | Resale | 760000 | 915 | Strata | 06 to 10 | 831 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116141 | Apartment | 23 | OCR | Resale | 700000 | 807 | Strata | 11 to 15 | 867 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116142 | Apartment | 23 | OCR | Resale | 735000 | 818 | Strata | 06 to 10 | 898 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116143 | Apartment | 23 | OCR | Resale | 785000 | 915 | Strata | 11 to 15 | 858 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116144 | Apartment | 23 | OCR | Resale | 720000 | 818 | Strata | 06 to 10 | 880 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116145 | Apartment | 23 | OCR | Resale | 900000 | 1313 | Strata | 06 to 10 | 685 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919
116146 | Apartment | 23 | OCR | Resale | 805000 | 915 | Strata | 16 to 20 | 880 | 99.0 | 1994.0 | 2016 | 77.0 | Bukit Panjang LRT Station | 212.964958 | Hillion Mall | 265.365483 | 14337.480919

Duplicate rows

Most frequent

Index | Type | Postal District | Market Segment | Type of Sale | Price ($) | Area (Sqft) | Type of Area | Floor Level | Unit Price ($psf) | tenure_yrs_clean | lease_commencement | sale_yr | remaining_lease | nearest_mrt | nearest_mrt_dist | nearest_mall | nearest_mall_dist | nearest_cbd_dist | count
701 | Apartment | 13 | RCR | New Sale | 1796000 | 958 | Strata | 06 to 10 | 1875 | 99.0 | 2017.0 | 2020 | 96.0 | Woodleigh MRT Station | 103.796837 | The Poiz | 908.624530 | 6715.328380 | 20
2045 | Condominium | 5 | OCR | New Sale | 550000 | 463 | Strata | 11 to 15 | 1188 | 99.0 | 2015.0 | 2016 | 98.0 | Clementi MRT Station | 1543.726335 | The Clementi Mall | 1376.926421 | 11759.623548 | 19
2065 | Condominium | 5 | OCR | New Sale | 725000 | 603 | Strata | 11 to 15 | 1203 | 99.0 | 2015.0 | 2016 | 98.0 | Clementi MRT Station | 1543.726335 | The Clementi Mall | 1376.926421 | 11759.623548 | 18
2043 | Condominium | 5 | OCR | New Sale | 550000 | 463 | Strata | 06 to 10 | 1188 | 99.0 | 2015.0 | 2016 | 98.0 | Clementi MRT Station | 1543.726335 | The Clementi Mall | 1376.926421 | 11759.623548 | 17
90 | Apartment | 3 | RCR | New Sale | 1180000 | 431 | Strata | 31 to 35 | 2741 | 99.0 | 2019.0 | 2020 | 98.0 | Outram Park MRT Station | 198.124702 | People's Park Complex | 372.345357 | 1306.394630 | 11
482 | Apartment | 7 | CCR | New Sale | 1002000 | 409 | Strata | 01 to 05 | 2450 | 99.0 | 2019.0 | 2020 | 98.0 | Bugis MRT Station | 356.917008 | Bugis Cube | 35.853066 | 1925.187790 | 10
483 | Apartment | 7 | CCR | New Sale | 1022400 | 409 | Strata | 01 to 05 | 2500 | 99.0 | 2019.0 | 2020 | 98.0 | Bugis MRT Station | 356.917008 | Bugis Cube | 35.853066 | 1925.187790 | 10
560 | Apartment | 10 | CCR | New Sale | 1118000 | 484 | Strata | 06 to 10 | 2308 | 99.0 | 2018.0 | 2020 | 97.0 | Sixth Avenue MRT Station | 170.734619 | Grandstand | 918.931949 | 8076.427804 | 10
2395 | Condominium | 13 | RCR | New Sale | 793000 | 463 | Strata | 01 to 05 | 1713 | 99.0 | 2017.0 | 2018 | 98.0 | Woodleigh MRT Station | 109.254323 | The Poiz | 846.094002 | 6676.117205 | 10
2876 | Condominium | 18 | OCR | New Sale | 653000 | 463 | Strata | 01 to 05 | 1411 | 99.0 | 2018.0 | 2019 | 98.0 | Simei MRT Station | 645.759148 | Eastpoint Mall | 567.074620 | 12914.580599 | 10
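
A table like the one above can be approximated by counting identical rows. The sketch below is an illustration in plain Python with hypothetical, heavily shortened rows. It assumes that every repeat after the first copy counts as a duplicate, which mirrors the default behaviour of pandas `DataFrame.duplicated`; whether the report uses exactly this definition is an assumption.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (number of duplicate rows, [(row, count), ...] sorted by count)."""
    counts = Counter(rows)  # rows must be hashable, e.g. tuples
    repeated = {row: count for row, count in counts.items() if count > 1}
    # Every copy beyond the first occurrence is counted as a duplicate.
    n_duplicates = sum(count - 1 for count in repeated.values())
    most_frequent = sorted(repeated.items(), key=lambda item: item[1], reverse=True)
    return n_duplicates, most_frequent


# Hypothetical rows reduced to (Type, Price ($), Area (Sqft)).
rows = [
    ("Condominium", 550000, 463),
    ("Condominium", 550000, 463),
    ("Apartment", 1002000, 409),
    ("Condominium", 550000, 463),
    ("Condominium", 725000, 603),
    ("Apartment", 1002000, 409),
]
n_duplicates, most_frequent = duplicate_summary(rows)
print(n_duplicates)
for row, count in most_frequent:
    print(row, count)
```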